- This README contains a manifest of what exactly is contained in this
- distribution, a description of what the system does, and concise
- instructions for configuring the system. A man page that will explain
- everything in greater detail is currently being written and will be
- released shortly.
-
- Manifest:
- --------
-
- README - this file.
- 8mmbackup - driver script, callable from crontab or 'by hand'.
- fullPrimer - link to 8mmbackup, used to prime system for new backup cycle.
- fnFilter.awk - ancillary function used by 8mmbackup, not directly by user.
- join.bug - an interesting bug in join that prevented me from using it for
- a perfect purpose...
-
- What 8mmbackup does:
- -------------------
-
- This distribution contains two scripts that together implement a
- fairly sophisticated backup mechanism for use with a very high
- capacity backup tape device (originally an 8mm exabyte tape drive, as
- the name suggests). It's oriented to take advantage of the capacity
- of such backup devices, implementing minimally-intrusive backups such
- that system access time and operator management time are kept to a
- dramatic minimum. No restrictions on access to the host system are
- necessary during backups and the operator only needs to be involved to
- configure the setup and change tapes between each cycle (which can
- last for multiple weeks).
-
- The system is essentially an elaborate frontend for creating cpio
- archives on a high-capacity tape device, managing scheduled backup
- cycles consisting of an exhaustive ('full') backup followed by
- numerous 'incrementals', which include just those files that have been
- modified or created since the prior backup run. The configurer
- designates what directory hierarchies in the file system to save and
- what parts to prune from those directory hierarchies. Comprehensive
- online audit tracing and registration of preserved files is
- maintained, as well as tape-capacity monitoring and broad operational
- error checking and recovery, to ensure that minimal operator
- intervention does not mean missed and neglected problems.
-
- What 8mmbackup does not do:
- --------------------------
-
- 8mmbackup provides much of the power, with a number of important
- advantages, of the standard bsd 4.2 'dump'/'restore' with one important
- exception. 8mmbackup does *not* track deletion of files across
- incremental backups, so that files that are deleted within the course
- of a backup cycle may be resurrected when recovery of the cycle is
- effected, even when the recovery goes to a point after the file was
- deleted. The full backups at the beginning of each cycle contain only
- what is in the target file system when they are taken, and so establish
- a clean, synchronizing image of the target file system.
-
- Cpio is used directly to resurrect files from the archives created by
- 8mmbackup. The online accounting created during backup provides
- comprehensive information for locating particular files within each
- backup cycle, and the accounting information for prior cycles is
- available from the full backup at the front of the subsequent cycle,
- so it is always easy to pinpoint a file or set of target files. The
- accounting information is fairly self-explanatory: by looking through
- it you can figure out what you need to pinpoint any particular file,
- or what changed on what date. Until the document describing these
- things is done, that is what you're going to have to do.
-
- Configuration:
- -------------
-
- Configuration of 8mmbackup is simple and straightforward. It entails
- setting some variables in the 8mmbackup script and creating an entry in
- the system crontab for periodic activation of the system, all of which
- I describe below. From then on operation of the system is automatic,
- except for initialization of each new backup cycle. This consists of
- invoking 'fullPrimer' and replacing the current cycle's tape with a
- fresh one. This does not immediately activate a full backup, but
- instead primes the system so that the next backup (either manual or
- via cron, etc) will be an exhaustive 'full' dump. (The effect of
- fullPrimer can be cancelled by invoking it again with an 'inhibit'
- arg. Put back in the old tape and use this command if you decide you
- want to continue with the incrementals for the current cycle but
- you've already prepared the system for a new cycle.)
-
- The comments at the beginning of the 8mmbackup script detail
- everything you need to know to configure the script.
-
- The following examples in the configurations section assume location of
- the scripts in a directory called /usr/local/lib/8mmbackup . They can
- be put instead in an arbitrary directory but will require sufficient
- room in the file system where they reside to hold the accounting
- information they maintain, which by very rough estimate amounts to
- about .013% of the volume of the target filesystems when using an
- exhaustive registry (see configuration of 8mmbackup below) or about
- .00003% of the volume of the target file systems when not using an
- exhaustive registry.
-
- Here are the specifics:
-
- Configuring the 8mmbackup script:
- --------------------------------
- There is a region in the script that is set aside for definitions of
- user-configured variables. It is delimited above by:
-
- #vvvvvvvvvvvvvvvvv User designated (configuration) variables vvvvvvvvvvvvvvvvvv
-
- and below by:
-
- #^^^^^^^^^^^^^^^^^^ User designated (configuration) variables ^^^^^^^^^^^^^^^^^
-
- As it says in the script just after the top delimiter, this section
- should be reproduced in a separate file named by the 'outboardConfig'
- variable in this section. Set the 'outboardConfig' and 'scriptDir'
- vars in the main script and then duplicate the entire section in the
- outboardConfig file.
-
- Here are some hints about configuring the script.
-
- Each variable definition is preceded by an explanation in comments and
- a default value, some of which will almost certainly have to be
- changed for your site, and some of which are probably best left as
- they are. Here are the variables, one by one:
-
- 'scriptDir': the directory where the backup scripts (including this
- one) reside.
-
- Example: 'set scriptDir=/usr/local/lib/8mmbackup'
-
- 'outboardConfig': a config file you can create that will override all
- settings (except 'scriptDir' and 'outboardConfig' settings themselves)
- in the main script. Useful to easily accommodate new versions of the
- script without losing your site-specific configurations.
-
- 'debug': Generally left undefined, setting this variable causes the
- script to run in a dry-run mode, with no accounting information or tape
- backups actually getting written. Setting it to a null value is useful
- for diagnosing where problems occur during a run, and setting it to the
- value 'verbose' is useful for seeing exactly what happens during the
- run - perhaps seeing more than you want to, however. Verbose is very
- verbose. In either case no accounting information or tape backup will
- be written.
-
- Example: 'set debug' or 'set debug=verbose'
- As shipped: '#set debug' (ie, inhibited)
-
- 'subjPaths': a list of the tops of the directory hierarchies that you
- want backed up. Wild cards can be used here. 'excludePaths' and
- 'excludeDirs' (described below) can be used to prune branches out
- of the designated target hierarchies.
-
- Example: 'set subjPaths=(/ /usr/spool/{mail,mqueue})'
-
- 'excludePaths': specific rooted paths to directories whose contents
- are to be excluded from backup. Wild cards are also valid here. Note
- that it will *not* help to include in 'subjPaths' (above) a path to a
- directory that is contained somewhere in a directory hierarchy
- included in 'excludePaths' - that subdirectory will still be excluded.
- Any paths that start with a path specified in 'excludePaths' (and
- their unravelled equivalents; see next paragraph) will be excluded
- from backup. Symbolic links from elsewhere to files that actually
- reside in an excluded path will not cause those files to be
- saved - the links themselves will be, but not the files. Hard links
- from non-excluded paths, however, will cause the files to be saved.
-
- Example: 'set excludePaths=(/usr/man/cat* /remote/{poobah,humbug,mogul})'
- As shipped: 'set excludePaths=(/usr/man/cat*)'
-
- One other note about these two path variables - any symbolic links
- mentioned in them are first 'unravelled' to the hard directories that
- they denote so that the path to the actual directory will be used as
- well as the path to the link.
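
The prefix rule for 'excludePaths' can be pictured with a small sketch. It is written in POSIX sh purely for illustration (8mmbackup itself is configured with csh syntax), and it is not taken from the script. The function name is_excluded is hypothetical. The sketch assumes that matching happens on whole path components, which the description above does not spell out, and it does not unravel symbolic links.

```shell
# Hypothetical helper, not part of 8mmbackup: succeed if the path in $1
# equals, or lies beneath, any of the remaining arguments (the excluded
# paths, already expanded from any wild cards).
is_excluded() {
    candidate=$1
    shift
    for excluded in "$@"; do
        case $candidate in
            # Quoted, so the excluded path is matched literally.
            "$excluded" | "$excluded"/*) return 0 ;;
        esac
    done
    return 1
}

# Assumed, illustrative paths.
is_excluded /usr/man/cat1/ls.1 /usr/man/cat1 /remote/poobah && echo excluded
is_excluded /usr/man/man1/ls.1 /usr/man/cat1 /remote/poobah || echo kept
```

Under this reading, listing /usr/man/cat1/local in 'subjPaths' would not rescue it, because it still starts with an excluded path.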
-
- 'excludeDirs': pathless names of those directories whose contents
- should always be excluded from backup, eg 'tmp'. In contrast to the
- way 'excludePaths' works, you *can* make contents of 'excludeDirs'
- excluded directories get included in backup by explicitly specifying
- them in 'subjPaths'. For instance, we exclude 'spool' directories
- via 'excludeDirs' but explicitly add '/usr/spool/{mail,mqueue}' as
- part of our subject paths. This way the mail related spool
- directories on the backup machine are preserved, but other spool
- directories are not.
-
- Example: 'set excludeDirs=(tmp newsgroups spool lost+found)'
-
- 'exhaustiveRegistry': A registry of preserved files is maintained
- which indicates the paths of the files preserved within the current
- backup cycle and the specific backups during which they were
- preserved. 'exhaustiveRegistry' controls whether that file is to
- contain only the paths of files that have been modified since the
- prior backup (full or incremental regardless) or whether it is to
- contain a roster of all the files. The distinction only concerns
- those files preserved during the full backup that had not been
- modified since the last backup of the prior cycle. With
- 'exhaustiveRegistry' set all the paths are registered and without it
- set only the recently modified ones are registered - this doesn't
- affect what gets saved, only what gets registered. The rationale for
- having an exhaustive registry is obvious - the registry reflects
- what's on tape. The rationale against has to do with online space
- savings (about a factor of 500 in size, or 40 kilobytes for a 1
- gigabyte target file hierarchy for non-exhaustive vs about 20 megabytes
- for exhaustive) from a smaller registry. I'm quite comfortable with a
- non-exhaustive registry, but a number of people around here elect for
- an exhaustive one. In either case the registry entries are marked to
- indicate during which backup(s) in the cycle they were preserved and
- whether they had been recently modified.
-
- Example: 'set exhaustiveRegistry'
- As shipped: '#set exhaustiveRegistry' (inhibited)
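
For a rough feel of the trade-off, the per-gigabyte figures quoted above (about 40 kilobytes non-exhaustive, about 20 megabytes exhaustive) can be scaled to the size of your own target hierarchies. This is a sketch in POSIX sh, not something shipped in the distribution. registry_kb is a hypothetical helper, it takes 20 megabytes as 20000 kilobytes, and it assumes the registry grows linearly with the volume backed up.

```shell
# Hypothetical helper: estimate registry size in kilobytes.
#   $1 - size of the target hierarchies, in whole gigabytes
#   $2 - 'exhaustive' for an exhaustive registry, anything else otherwise
registry_kb() {
    case $2 in
        exhaustive) echo $(( $1 * 20000 )) ;;   # ~20 megabytes per gigabyte
        *)          echo $(( $1 * 40 )) ;;      # ~40 kilobytes per gigabyte
    esac
}

registry_kb 3 exhaustive    # prints 60000
registry_kb 3 terse         # prints 120
```

The ratio between the two results is the factor of 500 mentioned above.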
-
- 'realDev': the device that should receive the backup. This should
- always be a no-rewind tape device (ie, one that can be manipulated with
- 'mt' commands). Exabyte drives as configured with Perfect Byte
- software are '/dev/nrrt0' - 'rt0' being the device and 'nr' being the
- no-rewind specification.
-
- Example: 'set realDev=/dev/nrrt0'
- As shipped: 'set realDev=/dev/nrst1'
-
- 'totalCap': total capacity (in bytes) that a single backup device
- volume (tape) will hold. We use 2200000000 (2.2 billion) for the
- exabyte drive.
-
- Example: 'set totalCap=2200000000'
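
How 8mmbackup itself monitors tape capacity is not described here. The following POSIX sh sketch only illustrates the kind of bookkeeping that a 'totalCap' figure makes possible. remaining_capacity is a hypothetical helper and the three archive sizes are made-up values. It assumes a shell with 64-bit arithmetic, since 2200000000 does not fit in 32 bits.

```shell
# Hypothetical helper: subtract the sizes of the archives already written
# in this cycle from the tape capacity and print what is left, in bytes.
#   $1   - total capacity of the tape
#   $2.. - sizes of the archives written so far
remaining_capacity() {
    remaining=$1
    shift
    for size in "$@"; do
        remaining=$(( remaining - size ))
    done
    echo "$remaining"
}

total_cap=2200000000
# One full backup followed by two incrementals (illustrative sizes).
remaining_capacity "$total_cap" 1500000000 120000000 95000000   # prints 485000000
```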
-
- 'errorNoticeTo': a username that any backup error notifications should
- be sent to. You may want to establish a mail alias on your backup
- server that can be directed to a few users and be visible as a general
- resource, like with the 'Postmaster' alias.
-
- As shipped: 'set errorNoticeTo=backupmaster'
-
- 'verifyTolerance': If set, a fairly stringent but time-consuming
- validation of the tape archive is done at the end of processing.
- If unset, a much less assured but much faster verification is done.
- See the notes in the configuration section for more details.
-
-
- Those are all the user-configured variables in the 8mmbackup script -
- nothing else should be changed there unless you want to really hack on
- it...
-
- Commissioning 8mmbackup for scheduled execution:
- -----------------------------------------------
- 8mmbackup is designed to be run either as a shell command from a
- terminal or automatically by invocation from the 'cron' service.
- Normal operation is generally via 'cron', with operator intervention
- to prime the system for full backup, by (1) changing the tape, and (2)
- invoking fullPrimer. The next time 8mmbackup is invoked (with no
- specific arguments) it will do a full backup. Thus a single cron entry
- will suffice for both full and incremental backups.
-
- Assuming location of the scripts in a /usr/local/lib/8mmbackup
- directory, a crontab entry that runs 8mmbackup every evening at 10:45
- would be:
-
- 45 22 * * * /usr/local/lib/8mmbackup/8mmbackup
-
- Reading the accounting:
- ----------------------
-
- If the configuration variable 'exhaustiveRegistry' is left unset in
- the configuration, the accounting file 'registry' details just
- those files in the current backup cycle that were modified since the
- previous backup, along with the specific sequence id numbers of
- the backups in the cycle they were taken on. With 'exhaustiveRegistry'
- set the registry includes the pathname of every file that was included,
- with all the files included in the full marked with an underscore ('_')
- and those files included in the full and also modified distinguished
- from the others by a '0' sequence id. This may sound complicated, but
- it's obvious when you look at the registry.
-
- The accounting file 'backup.log' details the trace of each backup run.
- backup.log is not reinitialized at the beginning of each backup cycle.
-
- The accounting file 'seqStats' details the association between sequence
- id and date, archive size, and any other pertinent parameters of each
- run. seqStats is reinitialized at the beginning of each backup cycle.
-
-
- Reading the archives:
- --------------------
- I define some aliases in the 8mmbackup script that exemplify a general
- way to access the 8mm device:
-
- alias 8mmin dd if='\!\!:1' ibs=5120
- alias 8mmout dd of='\!\!:1' obs=5120
-
- If you want to read in an archive, change to a directory where you
- want the paths that you're resurrecting to be rooted. *Note* that the
- entire path will come in, so, eg, if you restore into the root
- directory the files will overwrite any existing versions of them. In
- general it's best to resurrect in a temporary directory where you
- won't unintentionally overwrite something. The path of the directory
- where you execute cpio is prepended to the entire path where the
- files originally existed.
-
- Position to the archive you want to read using 'mt', then read it with
- the 'dd' invocation that the '8mmin' alias above stands for, piped to
- cpio - eg, for the seventh backup in device /dev/nrst1:
-
- mt -f /dev/nrst1 fsf 7
- cd /enough/space/tmp
- dd if=/dev/nrst1 ibs=5120 | cpio -ivd "*/spiffy/path/stuff/*"
- [numerous messages]
-
-
-
-
- klm@cme.nist.gov
-